Towards Ontology-based Information Extraction and Annotation of Paper Documents for Personalized Knowledge Acquisition

Authors

  • Benjamin Adrian
  • Heiko Maus
  • Malte Kiesel
  • Andreas Dengel
Abstract

Despite the advent of electronic personal information management (PIM) tools, knowledge workers still rely heavily on paper-based information sources. Yet even sophisticated PIM tools such as the Semantic Desktop have so far neglected the knowledge worker's paper world. Thus, electronically archiving a web page for later reference is much easier than keeping track of an interesting magazine article, whose copy may gather dust on the user's shelf and be long forgotten by the time it would help with a specific task. This paper presents how to use document analysis, ontology-based information extraction, and annotation techniques for personal knowledge acquisition in order to bridge the gap between the user's paper world and their personal knowledge space in the Semantic Desktop. A recent prototype shows the feasibility of the approach.

Similar resources

Automatic Knowledge Acquisition and Integration Technique: Application to Large Scale Taxonomy Extraction and Document Annotation

We present new results of our research on integration of ontologies created automatically by means of Human Language Technologies. The research is related to OLE (Ontology LEarning) – a project aimed at bottom-up generation and merging of ontologies. It utilises a proposal of expressive uncertain knowledge representation framework called ANUIC (Adaptive Net of Universally Interrelated Concepts)...

Linguistic Annotation for the Semantic Web

Establishing the semantic web on a large scale implies the widespread annotation of web documents with ontology-based knowledge markup. For this purpose, tools have been developed that allow for semi-automatic annotation of web documents with ontology-based metadata. However, given that a large number of web documents consist either fully or at least partially of free text, language technology ...

Towards a System for Ontology-Based Information Extraction from PDF Documents

Ontologies make it possible to encode domain knowledge directly in software applications, so ontology-based systems can exploit the meaning of information to provide advanced and intelligent functionality. One of the most interesting and promising applications of ontologies is information extraction from unstructured documents. In this area the extraction of meaningful information from PDF documents ...

Automatic Ontology-Based Knowledge Extraction from Web Documents

these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to class...

An Ontology Based Automatic Annotation and Semantic Information Retrieval from Tamil Documents

The use of Ontologies to overcome the limitations of keyword-based search has been put forward as one of the motivations of the Semantic IR. We propose a model, which includes an ontology-based scheme for the automatic annotation of Tamil documents and a retrieval system. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm...

Publication date: 2009